
Fix capacityType label in AWS ManagedNodeGroup #6261

Merged · 1 commit merged into kubernetes:master on Dec 4, 2023

Conversation

sibucan
Contributor

@sibucan sibucan commented Nov 8, 2023

What type of PR is this?

/kind bug

What this PR does / why we need it:

When using a ManagedNodeGroup (MNG), CAS is supposed to use the EKS DescribeNodegroup API call to learn what the nodes in the NodeGroup will look like, so it can decide where a Pod should go when no nodes currently exist for the MNG. This is necessary because there's currently no native way to set tags on an MNG's underlying ASG through the MNG API, and CAS traditionally depended on those ASG tags to tell when it should scale up from 0 because a Pod could potentially fit there.
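For illustration, here is a minimal, self-contained Go sketch of what that inference looks like against the DescribeNodegroup API with aws-sdk-go v1. This is not the actual CAS code; inferNodegroupLabels is a hypothetical helper, written only to show where the buggy key ends up:

package main

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/eks"
)

// inferNodegroupLabels (hypothetical helper) builds the label set CAS would
// assume for a node in an empty MNG, from the DescribeNodegroup response.
func inferNodegroupLabels(svc *eks.EKS, cluster, nodegroup string) (map[string]string, error) {
	out, err := svc.DescribeNodegroup(&eks.DescribeNodegroupInput{
		ClusterName:   aws.String(cluster),
		NodegroupName: aws.String(nodegroup),
	})
	if err != nil {
		return nil, err
	}
	labels := map[string]string{}
	for k, v := range out.Nodegroup.Labels {
		labels[k] = aws.StringValue(v)
	}
	if ct := aws.StringValue(out.Nodegroup.CapacityType); ct != "" {
		// The bug described below: the value lands under the bare key
		// "capacityType" instead of "eks.amazonaws.com/capacityType",
		// the key EKS actually places on nodes.
		labels["capacityType"] = ct
	}
	return labels, nil
}

func main() {
	svc := eks.New(session.Must(session.NewSession()))
	labels, err := inferNodegroupLabels(svc, "eks01", "bottlerocket_x86_spot-20231003194631949400000003")
	if err != nil {
		panic(err)
	}
	fmt.Println(labels) // e.g. map[... capacityType:SPOT ...]
}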

However, the capacityType label that CAS gets from the DescribeNodegroup API call is missing a crucial prefix. A Spot instance node will have the eks.amazonaws.com/capacityType label placed on it, but when CAS gets the node label list from the API, it'll look like this:

I1005 17:40:24.332839       1 aws_manager.go:310] node.Labels : map[amiType:BOTTLEROCKET_x86_64 capacityType:SPOT cluster-name:eks01 eks.amazonaws.com/nodegroup:bottlerocket_x86_spot-20231003194631949400000003 k8sVersion:1.28 kubernetes.io/arch:amd64 kubernetes.io/hostname:eks-bottlerocket_x86_spot-20231003194631949400000003-bcc57b89-f4ef-9c63-7c70-b4ab5325eb78-asg-6921437539965377742 kubernetes.io/os:linux node.kubernetes.io/instance-type:m5.xlarge nodegroup-name:bottlerocket_x86_spot-20231003194631949400000003 topology.kubernetes.io/region:us-east-1 topology.kubernetes.io/zone:us-east-1a]

Note that the label CAS determined from the API call was:

capacityType:SPOT

In a real node, the labels look like this (taken from kubectl describe node):

Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=m6i.xlarge
                    beta.kubernetes.io/os=linux
                    eks.amazonaws.com/capacityType=SPOT
                    eks.amazonaws.com/nodegroup=bottlerocket_x86_spot-20231003194631949400000003
                    eks.amazonaws.com/nodegroup-image=ami-09c9ab15237fd5960
                    eks.amazonaws.com/sourceLaunchTemplateId=lt-0d333d9b550f53aad
                    eks.amazonaws.com/sourceLaunchTemplateVersion=3
                    failure-domain.beta.kubernetes.io/region=us-east-1
                    failure-domain.beta.kubernetes.io/zone=us-east-1a
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-10-250-242-179.ec2.internal
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=m6i.xlarge
                    topology.ebs.csi.aws.com/zone=us-east-1a
                    topology.kubernetes.io/region=us-east-1
                    topology.kubernetes.io/zone=us-east-1a

So if I tried to schedule a Pod using the following nodeAffinity:

      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                - key: "eks.amazonaws.com/capacityType"
                  operator: In
                  values: ["SPOT"]

CAS will conclude that my Spot MNG (with the discovered label capacityType: SPOT) can "never" fit my Pod, which requests eks.amazonaws.com/capacityType: SPOT, and reports that the Pod can't be scheduled anywhere in the cluster:

I1005 17:56:50.997360       1 orchestrator.go:546] Pod autoscaler-test/test-59f845b4f8-qvtll can't be scheduled on eks-bottlerocket_x86_spot-20231003194631949400000003-bcc57b89-f4ef-9c63-7c70-b4ab5325eb78, predicate checking error: node(s) didn't match Pod's node affinity/selector; predicateName=NodeAffinity; reasons: node(s) didn't match Pod's node affinity/selector; debugInfo=
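To see the mismatch in isolation, here is a small standalone Go sketch of the same key comparison, using k8s.io/apimachinery's label selectors directly rather than CAS's actual predicate code:

package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/apimachinery/pkg/selection"
)

func main() {
	// The Pod's requirement: eks.amazonaws.com/capacityType in (SPOT)
	req, err := labels.NewRequirement("eks.amazonaws.com/capacityType", selection.In, []string{"SPOT"})
	if err != nil {
		panic(err)
	}
	selector := labels.NewSelector().Add(*req)

	discovered := labels.Set{"capacityType": "SPOT"}               // what CAS inferred from the API
	actual := labels.Set{"eks.amazonaws.com/capacityType": "SPOT"} // what the node really has

	fmt.Println(selector.Matches(discovered)) // false -> CAS refuses to scale up
	fmt.Println(selector.Matches(actual))     // true
}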

However, if I used the "wrong" label capacityType: SPOT in the affinity instead, CAS would autoscale the correct MNG, but my Pod would then stay Pending: the label on the actual node carries the eks.amazonaws.com/ prefix, so the Pod's nodeAffinity no longer matches and it can never be scheduled onto that node. 😢

The fix here makes the autodiscovered label match the one that is placed on the nodes once the MNG scales up for the first time.
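In shape, the change amounts to publishing the value under the well-known key (a hedged sketch, not the literal diff; applyCapacityType is a hypothetical name):

// capacityTypeLabel is the key EKS itself places on nodes.
const capacityTypeLabel = "eks.amazonaws.com/capacityType"

func applyCapacityType(labels map[string]string, capacityType string) {
	// Before the fix the value was stored under the bare key:
	//   labels["capacityType"] = capacityType
	labels[capacityTypeLabel] = capacityType
}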

Note: this does not happen when the MNG already has at least one node, because CAS then caches the correct set of labels from a real node and never needs the EKS DescribeNodegroup API call.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Fixed an issue where the capacityType label inferred from an empty AWS ManagedNodeGroup did not match the label placed on the nodes after the group scales from 0 -> 1.

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. area/provider/aws Issues or PRs related to aws provider labels Nov 8, 2023
@k8s-ci-robot k8s-ci-robot added area/cluster-autoscaler area/vertical-pod-autoscaler size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Nov 8, 2023
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Nov 8, 2023
Fixes an issue where the capacityType label inferred from an empty
EKS ManagedNodeGroup does not match the same label on the node after it
is created and joins the cluster
@vadasambar
Member

/assign @vadasambar

@vadasambar
Member

A nit, LGTM otherwise. Description is easy to understand (thank you)

@vadasambar
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 15, 2023
@vadasambar
Member

vadasambar commented Nov 15, 2023

@drmorr0, @gjtempleton need your approval.

@gjtempleton
Member

Thanks for this!
/lgtm
/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gjtempleton, sibucan, vadasambar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 4, 2023
@k8s-ci-robot k8s-ci-robot merged commit c0661b5 into kubernetes:master Dec 4, 2023
6 checks passed